Fully-Nested Interactive POMDPs for Partially-Observable Turn-Based Games

Author

  • Pol Rosello
Abstract

Interactive POMDPs (I-POMDPs) are a useful framework for describing POMDPs that interact with other POMDPs. I-POMDPs are solved recursively in levels: a level-1 I-POMDP assumes the opponent acts randomly, and a level-k I-POMDP assumes the opponent is a level-(k-1) I-POMDP. In this paper, we introduce fully-nested I-POMDPs, which are uncertain about the physical state of the game, the level of their opponent, and the opponent’s belief about both. This paper has three main contributions: it (1) introduces the framework for turn-based fully-nested I-POMDPs and shows how to reduce them to POMDPs; (2) motivates fully-nested I-POMDPs by introducing the game of partially-observable nim and solving it using SARSOP; and (3) shows empirically that increasing the level of a fully-nested I-POMDP does not become intractable for this game.
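
The level recursion described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes a fully observable, single-pile nim variant (remove 1 to 3 stones, whoever takes the last stone wins), and it omits SARSOP and all belief tracking. The names `MAX_TAKE`, `policy`, `opp_win`, and `win_prob` are hypothetical. Level 0 stands for the random opponent, and a level-k player best-responds to a level-(k-1) opponent.

```python
from functools import lru_cache

MAX_TAKE = 3  # assumed rule: take 1..3 stones; taking the last stone wins


def moves(n):
    """Legal moves with n stones left."""
    return range(1, min(MAX_TAKE, n) + 1)


def move_value(level, n, a):
    """Win probability for a level-`level` player who takes `a` of `n` stones,
    assuming the opponent is a level-(level-1) player."""
    if a == n:
        return 1.0  # took the last stone
    return 1.0 - opp_win(level, n - a)


@lru_cache(maxsize=None)
def policy(level, n):
    """Action distribution of a level-`level` player facing n >= 1 stones."""
    acts = list(moves(n))
    if level == 0:
        # Level 0 models the random opponent: uniform over legal moves.
        return {a: 1.0 / len(acts) for a in acts}
    best = max(acts, key=lambda a: move_value(level, n, a))
    return {best: 1.0}


@lru_cache(maxsize=None)
def opp_win(level, n):
    """Probability that the level-(level-1) opponent, moving with n stones,
    beats the level-`level` player."""
    total = 0.0
    for b, p in policy(level - 1, n).items():
        if b == n:
            total += p  # opponent takes the last stone
        else:
            total += p * (1.0 - win_prob(level, n - b))
    return total


@lru_cache(maxsize=None)
def win_prob(level, n):
    """Best achievable win probability for a level-`level` player (level >= 1)."""
    return max(move_value(level, n, a) for a in moves(n))


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, [round(win_prob(k, n), 3) for n in range(1, 9)])
```

In this toy setting each level costs only one extra pass over the pile sizes, because the opponent's level is known and the state is observed. The fully-nested model in the abstract is harder precisely because both of those are uncertain.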

Similar resources

Anytime Point Based Approximations for Interactive POMDPs

Partially observable Markov decision processes (POMDPs) have been largely accepted as a rich framework for planning and control problems. In settings where multiple agents interact, POMDPs prove to be inadequate. The interactive partially observable Markov decision process (I-POMDP) is a new paradigm that extends POMDPs to multiagent settings. The added complexity of this model due to the modeli...

Generalized and bounded policy iteration for finitely-nested interactive POMDPs: scaling up

Policy iteration algorithms for partially observable Markov decision processes (POMDPs) offer the benefits of quick convergence and the ability to operate directly on the solution, which usually takes the form of a finite state controller. However, the controller tends to grow quickly in size across iterations, which makes its evaluation and improvement costly. Bounded policy iteration pr...

Decayed Markov Chain Monte Carlo for Interactive POMDPs

To act optimally in a partially observable, stochastic and multi-agent environment, an autonomous agent needs to maintain a belief of the world at any given time. An extension of partially observable Markov decision processes (POMDPs), called interactive POMDPs (I-POMDPs), provides a principled framework for planning and acting in such settings. I-POMDP augments the POMDP beliefs by including m...

Improved Planning for Infinite-Horizon Interactive POMDPs using Probabilistic Inference (Extended Abstract)

We provide the first formalization of self-interested multiagent planning using expectation-maximization (EM). Our formalization in the context of infinite-horizon and finitely-nested interactive POMDP (I-POMDP) is distinct from EM formulations for POMDPs and other multiagent planning frameworks. Specific to I-POMDPs, we exploit the graphical model structure and present a new approach based on b...

On the Difficulty of Achieving Equilibrium in Interactive POMDPs

We analyze the asymptotic behavior of agents engaged in an infinite horizon partially observable stochastic game as formalized by the interactive POMDP framework. We show that when agents’ initial beliefs satisfy a truth compatibility condition, their behavior converges to a subjective ε-equilibrium in finite time, and subjective equilibrium in the limit. This result is a generalization of a ...

Publication date: 2016